Abstract

This paper considers the class of deterministic continuous-time optimal control 
problems (OCPs) with piecewise-affine (PWA) vector field, polynomial Lagrangian 
and semialgebraic input and state constraints. The OCP is first relaxed as an 
infinite-dimensional linear program (LP) over a space of occupation measures. This 
LP, a particular instance of the generalized moment problem, is then approached 
by an asymptotically converging hierarchy of linear matrix inequality (LMI) relax- 
ations. The relaxed dual of the original LP returns a polynomial approximation of 
the value function that solves the Hamilton-Jacobi-Bellman (HJB) equation of the 
OCP. Based on this polynomial approximation, a suboptimal policy is developed to 
construct a state feedback in a sample-and-hold manner. The results show that the 
suboptimal policy succeeds in providing a stabilizing suboptimal state feedback law 
that drives the system relatively close to the optimal trajectories and respects the 
given constraints. 

1 Introduction 

Piecewise-affine (PWA) systems form a large modeling class for nonlinear systems. However,
most of nonlinear control theory does not apply to PWA systems because it requires
certain smoothness assumptions. On the other hand, linear control theory cannot be
applied directly because of the special properties that PWA systems inherit from nonlinear
systems. PWA systems arise naturally from linear systems in the presence of state
saturation, or from simple hybrid systems with state-based switching where the continuous
dynamics in each regime are linear or affine [10]. Many engineering systems fall into this
category, for example power electronics converters. In addition, common electrical
circuit components such as diodes and transistors are naturally modeled as piecewise-linear
elements. PWA systems are also used to approximate large classes of nonlinear systems,
as in [11], [19], and [1]. These approximations are then used to pose the controller design
problem of the original nonlinear system as a robust control problem of an uncertain
nonlinear system, as suggested in [21, 22].

Problems of piecewise-affine systems are known to be challenging. The problems have a 
complex structure of regions stacked together in the state-space with each region contain- 
ing an affine system. Any approach must identify the behavior in each region and then 
link them together to form a global picture of the dynamics. In [2], it has been shown 
that even for some simple PWA systems the problem of analysis or design can be either 
NP-hard or undecidable. 

The motivation behind the research in this paper is the need for optimal control synthesis
methods for continuous-time PWA systems with both input and state constraints.
Additionally, there is often a need to find suitable tools for the design of stabilizing feedback
controllers under state and input constraints. Over the last few years, there have been several
attempts at addressing the synthesis problem for continuous-time PWA systems. The
techniques are based on analysis methods and use convex optimization. These methods result
in state-based switched linear or affine controllers. For example, in [8] a piecewise-linear
state-feedback controller is synthesized for piecewise-linear systems by solving a convex
optimization problem involving LMIs. The method is based on constructing a globally
quadratic Lyapunov function such that the closed-loop system is stable. Similarly, in [18]
a quadratic performance index is suggested to obtain lower and upper bounds for the
optimal cost using any stabilizing controller. However, the optimal controller is not
computed. The method assumes a piecewise-affine controller structure, which can be shown
not to be always optimal (see Section 5). In [20], the work done in [8] is extended to
obtain dynamic output feedback stabilizing controllers for piecewise-affine systems. It
formulates the search for a piecewise-quadratic Lyapunov function and a piecewise-affine
controller as a nonconvex Bilinear Matrix Inequality (BMI), which is solved only locally
by convex optimization methods. More recently, in [12], a nonconvex BMI formulation is
used to compute a state feedback control law.

For constrained PWA systems in discrete-time where both the partitions and the con- 
straints are polyhedral regions, pQ combines multi-parametric programming, dynamic 
programming and polyhedral manipulation to solve optimal control problems for linear or 
quadratic performance indices.¹ The resulting solution, when applied in receding horizon
fashion, guarantees stability of the closed-loop system.

To the best of the authors' knowledge there are no available guaranteed methods for 
synthesis of optimal controllers in continuous-time for PWA systems that consider general 
semi-algebraic state-space partitions, or that do not restrict the controller to be piecewise- 
affine, or do not require the performance index to be quadratic or piecewise-quadratic. 

The technique presented in this paper provides a systematic approach, inspired by [HUES],
to synthesize a suboptimal state feedback control law for continuous-time PWA systems
with multiple equilibria based on a polynomial approximation of the value function. The
OCP is first formulated as an infinite-dimensional linear program (LP) over a space of
occupation measures. The PWA structure of the dynamics and the state-space partition
are then used to decompose the occupation measure of the trajectory into a combination
of local occupation measures, one measure for each partition cell. Then, the LP
formulation can be written in terms of only the moments of the occupation measures
(countably many variables). This allows for a numerical solution via a hierarchy of convex
LMI relaxations with vanishing conservatism, which can be solved using off-the-shelf
SDP solvers. The relaxations give an increasing sequence of lower bounds on the optimal
value. An important feature of the approach is that state constraints as well as input
constraints are very easy to handle. They are simply reflected into constraints on the
supports of the occupation measures. It turns out that the dual formulation of the original
infinite-dimensional LP problem on occupation measures can be written in terms of
Sum-of-Squares (SOS) polynomials, which when solved yields a polynomial subsolution of
the Hamilton-Jacobi-Bellman (HJB) equation of the OCP. This gives a good polynomial
approximation of the value function along optimal trajectories that can be used to synthesize a
suboptimal, yet admissible, control law. The idea behind the developed suboptimal policy
exploits the structure of the HJB equation to generate the optimal control trajectory. The
right-hand side of the HJB equation is iteratively minimized to construct a state feedback
in a sample-and-hold manner with stabilization and suboptimality guarantees.

¹ We are grateful to Michal Kvasnica for pointing out this reference.

2 The piecewise-affine optimal control problem 

In this section, we first introduce the piecewise-affine continuous-time model, and then 
formulate the optimal control problem using some important assumptions. 

2.1 Piecewise-affine systems 

We consider exclusively continuous-time PWA systems. The term PWA is to be understood
as PWA in the system state $x$. The state space is assumed to be partitioned into a
number of cells $X_i$ such that the dynamics in each cell take the form

$$\dot{x} = A_i x + a_i + B_i u \quad \text{for } x \in X_i,\ i \in I, \tag{1}$$

where $I$ is the set of cell indices, and the union of all cells is
$X = \bigcup_{i \in I} X_i \subset \mathbb{R}^n$. The
global dynamics of the system depend on both the cells and the corresponding local
dynamics. The matrices $A_i$, $a_i$, and $B_i$ are time independent. In general, the geometry
of the partition cells $X_i$ can be arbitrary. However, to arrive at useful results we assume the
cells to be compact basic semi-algebraic sets (intersections of polynomial sublevel sets)
with disjoint interiors. They are allowed to share boundaries as long as these boundaries
have Lebesgue measure zero in $X$. There are many notions of solutions for PWA systems
with different regularity assumptions on the vector field. The concern here is to ensure
the uniqueness of the trajectories. Systems with discontinuous right-hand sides can have
attracting sliding modes or non-unique trajectories, or trajectories may not even exist in
the classical sense [5], [10]. We assume that the PWA system is well-posed in the sense
that it generates a unique trajectory for any given initial state. This is guaranteed if we
assume that the global vector field is Lipschitz, which is usually the case if the model is the
result of approximating a nonlinear function.
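As an illustration of the model (1), the following minimal Python sketch evaluates a PWA vector field by locating the cell that contains the state. The `Cell` class, the function `pwa_field`, and the scalar two-cell system on $[-2, 2]$ are hypothetical choices made only for this sketch; each cell is described by polynomial inequalities $p_k(x) \ge 0$, in the spirit of the basic semi-algebraic description assumed above.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Cell:
    """One cell X_i of a scalar PWA system: x' = A*x + a + B*u on {x : p_k(x) >= 0 for all k}."""
    A: float
    a: float
    B: float
    constraints: Sequence[Callable[[float], float]]  # polynomials p_k defining the cell

    def contains(self, x: float) -> bool:
        return all(p(x) >= 0.0 for p in self.constraints)


def pwa_field(cells, x, u):
    """Evaluate the global vector field by using the first cell that contains x."""
    for c in cells:
        if c.contains(x):
            return c.A * x + c.a + c.B * u
    raise ValueError("state outside the partition")


# Hypothetical two-cell system on X = [-2, 2] (illustrative values only).
cells = [
    Cell(A=-1.0, a=1.0, B=1.0, constraints=[lambda x: x, lambda x: 2.0 - x]),   # X_1 = [0, 2]
    Cell(A=1.0, a=1.0, B=1.0, constraints=[lambda x: -x, lambda x: x + 2.0]),   # X_2 = [-2, 0]
]

print(pwa_field(cells, 1.0, 0.0), pwa_field(cells, -1.0, 0.0))
```

Both local models agree on the shared boundary $x = 0$, so the global vector field of this toy system is continuous and Lipschitz, which is the well-posedness situation assumed in the text.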
2.2 Problem formulation



Optimal control problems (OCPs) of PWA systems are usually Lagrange problems where 
the state-space is partitioned into a finite number of cells. In addition to the dynamics, 
the cost functional can be also defined locally in each cell. The objective is to find optimal 
trajectories starting from an initial set and terminating at a target set that minimize the 
running cost and respect some input and state constraints. Consider the following general 
free-time PWA OCP with both terminal and running costs 

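$$
\begin{aligned}
v^*(x_0) = \inf_{T,\,u}\ & L_T(x_T) + \sum_{i=1}^{r} \int_0^T L_i(x(t),u(t))\, \mathbf{1}_{X_i}(x(t))\, dt \\
\text{s.t.}\ & \dot{x}(t) = A_i x(t) + a_i + B_i u(t), \quad x(t) \in X_i, \\
& i = 1,\dots,r, \quad t \in [0,T], \\
& x(0) = x_0 \in X_0 \subset \mathbb{R}^n, \\
& x(T) = x_T \in X_T \subset \mathbb{R}^n, \\
& (x(t),u(t)) \in X \times U \subset \mathbb{R}^n \times \mathbb{R}^m,
\end{aligned}
\tag{2}
$$

where the scalar mapping $\mathbf{1}_{X_i} : X \to \{0, 1\}$ is the indicator function defined as follows:

$$
\mathbf{1}_{X_i}(x(t)) =
\begin{cases}
1, & \text{if } x(t) \in X_i, \\
0, & \text{if } x(t) \notin X_i.
\end{cases}
$$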

The infimum in (2) is sought over all admissible control functions $u(\cdot)$ with free final time
$T$. The dynamic programming approach of optimal control reduces the above problem to
the problem of solving the following system of Hamilton-Jacobi-Bellman (HJB) equations
for the value function $v^*$:

$$
\inf_{u \in U} \{ \nabla v^*(x) \cdot f_i(x,u) + L_i(x,u) \} = 0, \quad
\forall (x,u) \in X_i \times U,\ \forall i = 1,\dots,r,
\tag{3}
$$

where $f_i(x,u) = A_i x + a_i + B_i u$, with the terminal condition

$$v^*(x(T)) = L_T(x(T)).$$

In full generality, solving the HJB equations is very hard and the value function is not 
necessarily differentiable. Therefore, solutions must be interpreted in a generalized sense. 

To proceed, we adopt the following assumptions: 

• The terminal time $T$ is finite and the control functions $u(\cdot)$ are measurable.

• The PWA system is well-posed in the sense that it generates a unique trajectory
from every initial state, and the vector field is locally Lipschitz.

• The Lagrangians and the terminal cost are polynomial maps, namely
$L_i \in \mathbb{R}[x,u]$ for all $i$ and $L_T \in \mathbb{R}[x]$.

• The cells $X_i$, the control set $U$, and the sets $X_0$ and $X_T$ are compact basic semi-algebraic
sets defined as follows:

$$X_i \times U = \{(x,u) \in \mathbb{R}^n \times \mathbb{R}^m \mid p_{i,k}(x,u) \ge 0,\ \forall k = 1,\dots,m_i\}, \quad i = 1,\dots,r, \tag{4}$$

$$
\begin{aligned}
X_0 &= \{x \in \mathbb{R}^n \mid p_{0,k}(x) \ge 0,\ \forall k = 1,\dots,m_0\}, \\
X_T &= \{x \in \mathbb{R}^n \mid p_{T,k}(x) \ge 0,\ \forall k = 1,\dots,m_T\}.
\end{aligned}
\tag{5}
$$

• The cells $X_i$ have disjoint interiors, and they are allowed to share boundaries as long
as these boundaries have Lebesgue measure zero in $X$.

3 The moment approach 

In this section, we reformulate the nonlinear and nonconvex PWA OCP (2) as a convex
infinite-dimensional optimization problem over the state-action occupation measure
$\mu$. The problem is then approached by an asymptotically converging hierarchy of LMI
relaxations to arrive at a polynomial approximation of the value function.

3.1 Occupation measures 

Occupation measures are used to deal with dynamic objects where time is involved. We
focus on the application of occupation measures to dynamic control systems where the
dynamic objects are ordinary differential equations. Starting from the PWA vector field, we
generate a sequence of moments by writing the ODEs in terms of occupation measures. We
then manipulate the measures through their moments to optimize over system trajectories
using a linear matrix inequality (LMI) representation.

To illustrate the measure formulation, first consider the uncontrolled autonomous dynamic
system defined by the Lipschitz vector field $f : \mathbb{R}^n \to \mathbb{R}^n$ and the nonlinear
differential equation

$$\dot{x}(t) = f(x(t)). \tag{6}$$

We think of the initial state $x_0$ as a random variable in $\mathbb{R}^n$ modeled by a probability
measure $\mu_0$ supported on the compact set $X_0$, i.e. a nonnegative measure $\mu_0$ such that
$\mu_0(X_0) = 1$. Then at each time instant $t$, the state $x(t)$ can also be seen as a random
variable ruled by a nonnegative probability measure $\mu_t$. Solving (6) for the state trajectory
$x(t)$ yields a family of trajectories starting in $X_0$ and ending at a final set $X_T$. The measure
$\mu_t$ of a set can then be thought of as the ratio of the volume of trajectory points that lie
inside that set at time $t$ to the total volume of points at time $t$. In particular, if the
number of trajectory points one is considering is finite, then the measure $\mu_t$ of a set is
nothing else than the ratio of trajectory points that lie in the set at time $t$ to the total
number of trajectory points. The family of measures $\mu_t$ can thus be thought of as a
density of trajectory points, and it satisfies the following linear first-order continuity PDE
in the space of nonnegative probability measures:

$$\frac{\partial \mu_t}{\partial t} + \nabla \cdot (f \mu_t) = 0. \tag{7}$$

The above equation is known as Liouville's equation or advection PDE, in which $\nabla \cdot (f \mu_t)$
denotes the divergence of the measure $f \mu_t$ in the sense of distributions. It describes the linear
transport of measures from the initial set to the terminal set.

The occupation measure of the solution over some subset $\mathcal{T} \times \mathcal{X}$ in the $\sigma$-algebra of
$[0, T] \times X$ is simply defined as the time integration of $\mu_t$ as follows:

$$\mu(\mathcal{T} \times \mathcal{X}) := \int_{\mathcal{T}} \mu_t(\mathcal{X})\,dt = \int_{\mathcal{T}} \mathbf{1}_{\mathcal{X}}(x(t))\,dt, \tag{8}$$

where the last equality is valid when $x(t)$ is a single solution to equation (6). It is
important to note that when the initial condition $x_0$ is deterministic, the occupation
measure $\mu$ is the time spent by the solution $x(t)$ in the subset $\mathcal{X}$ when $t \in \mathcal{T}$. We see
that the occupation measure can indicate when the solution is within a given subset.
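The interpretation of the occupation measure as "time spent in a subset" can be checked numerically. The sketch below is only an illustration under our own assumptions: the scalar system $\dot{x} = -x$, $x(0) = 1$, the subset $[0.5, 1]$, the horizon $T = 2$, and the helper name `occupation_measure` are not taken from the paper. The last integral in (8) is approximated with a midpoint rule.

```python
import math


def occupation_measure(traj, indicator, T, n_steps=20000):
    """Approximate int_0^T 1_S(x(t)) dt with a midpoint rule on a uniform time grid."""
    dt = T / n_steps
    return sum(dt for k in range(n_steps) if indicator(traj((k + 0.5) * dt)))


traj = lambda t: math.exp(-t)        # solution of x' = -x with x(0) = 1
in_S = lambda x: 0.5 <= x <= 1.0     # subset S = [0.5, 1] of the state space

mu_S = occupation_measure(traj, in_S, T=2.0)
mass = occupation_measure(traj, lambda x: True, T=2.0)
print(mu_S, mass)
```

For this trajectory the state leaves $[0.5, 1]$ at $t = \ln 2$, so `mu_S` should be close to $\ln 2$, while the mass of the occupation measure over the whole state space is the horizon $T$.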



3.2 The primal formulation 

In the sigma-algebra of Borel sets (the smallest sigma-algebra that contains the open sets), let
$\mathcal{M}(X)$ denote the space of signed Borel measures supported on a compact subset $X$ of
the Euclidean space. Furthermore, let $\mathcal{C}(X)$ be the space of bounded continuous functions
on $X$, equipped with the supremum norm. Then, the space $\mathcal{M}(X)$ is the dual space of
$\mathcal{C}(X)$ with the following duality bracket:

$$\langle v, \mu \rangle = \int_X v \, d\mu, \quad \forall (v,\mu) \in \mathcal{C}(X) \times \mathcal{M}(X). \tag{9}$$

Now consider the piecewise-affine optimal control problem (2), and assume that the control
trajectory $u(t)$ is admissible such that all the constraints of the OCP are respected.
Accordingly, we define the state-action local occupation measure (including sets on the
space of control inputs) associated with the cell $X_i$ to be

$$\mu_i([0,T] \times X_i \times U) = \int_0^T \mathbf{1}_{X_i \times U}(x(t),u(t))\,dt. \tag{10}$$

The local occupation measure $\mu_i(X_i \times U)$ encodes the trajectories in the sense that it
measures the total time spent by the trajectories $(x(t),u(t))$ in the admissible set
$X_i \times U$. Furthermore, define the global state-action occupation measure for the trajectory
$(x(t),u(t))$ as a linear combination of local occupation measures such that

$$\mu([0,T] \times X \times U) = \int_0^T \mathbf{1}_{X \times U}(x(t),u(t))\,dt = \sum_{i=1}^{r} \mu_i(X_i \times U). \tag{11}$$

The indicator function $\mathbf{1}_{X \times U}(x(t),u(t))$ equals 1 throughout the interval $[0,T]$. The
occupation measure of the whole state space is therefore given by the terminal time $T$. In general we
should make sure that $T$ is finite, otherwise the occupation measure of the whole state
space may escape to infinity. The initial and terminal occupation measures are probability
measures supported on $X_0$ and $X_T$, respectively.

The objective function of problem (2) can now be rewritten in terms of these measures
to get the following linear cost:

$$J(x_0,u(t)) = \sum_{i=1}^{r} \int_{X_i \times U} L_i \, d\mu_i + \int_{X_T} L_T \, d\mu_T. \tag{12}$$

With duality brackets, it reads

$$J(x_0,u(t)) = \sum_{i=1}^{r} \langle L_i, \mu_i \rangle + \langle L_T, \mu_T \rangle. \tag{13}$$

If the Lagrangian is the same for all the cells, say $L$, the performance measure can be
written in terms of the global state-action occupation measure as

$$J(x_0,u(t)) = \int_{X \times U} L \, d\mu + \int_{X_T} L_T \, d\mu_T. \tag{14}$$

Our next step is to determine the measure transport equation that encodes the PWA
dynamics in the measure space. To do that, we define a compactly supported global test
function $v \in \mathcal{C}^1(X)$. Then for $i = 1,\dots,r$ we define a linear map
$F_i : \mathcal{C}^1(X_i) \to \mathcal{C}(X_i \times U)$ by

$$F_i(v) := -\dot{v} = -\nabla v \cdot (A_i x + a_i + B_i u). \tag{15}$$

Integration gives the following:

$$\int_0^T dv = -\sum_{i=1}^{r} \int_{X_i \times U} F_i(v)\, d\mu_i = \int_{X_T} v \, d\mu_T - \int_{X_0} v \, d\mu_0. \tag{16}$$

Equivalently, we can write

$$\sum_{i=1}^{r} \langle F_i(v), \mu_i \rangle + \langle v, \mu_T \rangle = \langle v, \mu_0 \rangle, \quad \forall v \in \mathcal{C}^1(X). \tag{17}$$



Now define the following $(r+1)$-tuple, in which we gather all the local occupation measures
as the first $r$ elements, and the terminal occupation measure as the last element:

$$\nu := (\mu_1, \dots, \mu_r, \mu_T).$$

The piecewise-affine optimal control problem (2) is equivalent to the following
infinite-dimensional linear optimization problem over occupation measures:

$$
\begin{aligned}
p^* = \inf_{\nu}\ & \sum_{i=1}^{r} \langle L_i, \mu_i \rangle + \langle L_T, \mu_T \rangle \\
\text{s.t.}\ & \sum_{i=1}^{r} \langle F_i(v), \mu_i \rangle + \langle v, \mu_T \rangle = \langle v, \mu_0 \rangle, \quad \forall v \in \mathcal{C}^1(X).
\end{aligned}
\tag{18}
$$

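Furthermore, defining the linear mapping
$\mathcal{L} : \mathcal{C}^1(X) \to \prod_{i=1}^{r} \mathcal{C}(X_i \times U) \times \mathcal{C}(X_T)$ as
$\mathcal{L}(v) = (F_1(v), \dots, F_r(v), v)$, we can rewrite the constraint in (18) as

$$\langle (F_1(v), \dots, F_r(v), v), \nu \rangle = \langle \mathcal{L}(v), \nu \rangle = \langle v, \mathcal{L}^*(\nu) \rangle = \langle v, \mu_0 \rangle, \quad \forall v \in \mathcal{C}^1(X). \tag{19}$$

This defines the adjoint map
$\mathcal{L}^* : \prod_{i=1}^{r} \mathcal{M}(X_i \times U) \times \mathcal{M}(X_T) \to \mathcal{M}(X)$. The measure
transport equation (17) is then given by [7]

$$\mathcal{L}^*(\nu) = \mu_0 = \sum_{i=1}^{r} \nabla \cdot (f_i \mu_i) + \mu_T \tag{20}$$

with the symbol $f_i$ denoting the dynamics in the cell with index $i = 1, \dots, r$.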

Finally, define the tuple $c := (L_1, \dots, L_r, L_T)$, and associate the piecewise-affine optimal
control problem (2) to the following infinite-dimensional linear program:

$$
\begin{aligned}
p^* = \inf_{\nu}\ & \langle c, \nu \rangle \\
\text{s.t.}\ & \mathcal{L}^*(\nu) = \mu_0, \\
& \nu \succeq 0.
\end{aligned}
\tag{21}
$$

This is the primal formulation of the OCP in terms of occupation measures of the
trajectory $(x(t),u(t))$. The nonlinear nonconvex PWA OCP (2) is therefore reformulated as an
infinite-dimensional LP in the measure space.



3.3 Moments and LMI relaxations 

The moments of the occupation measure $\mu$ are defined by integration of monomials with
respect to $\mu$. The $\alpha$-th moment of $\mu$ over the support $X$ is given by

$$y_\alpha = \int_X x^\alpha \, d\mu, \quad \forall \alpha \in \mathbb{N}^n, \tag{22}$$

where $x^\alpha = \prod_{i=1}^{n} x_i^{\alpha_i}$. Noting that $d\mu(x) = \mu(dx)$ and using the definition (8), the
moments can be rewritten as

$$y_\alpha = \int_0^T [x(t)]^\alpha \, dt, \quad \forall \alpha \in \mathbb{N}^n,$$

where $x(t)$ denotes the solution of the ODE starting at $x_0$. This follows from the definition
of $\mu$. Therefore, if we can find the moments and handle the representation conditions (22),
solving for the moments gives the solution of the ODE, because the infinite (but countable)
number of moments uniquely characterize a measure (on a compact set).
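The time-integral form of the moments lends itself to a quick numerical check. The following sketch is a minimal illustration under our own assumptions (the scalar trajectory $x(t) = e^{-t}$ on $[0, 2]$ and the helper name `trajectory_moments` are not from the paper); it compares a midpoint-rule approximation of $y_\alpha = \int_0^T x(t)^\alpha\,dt$ with the closed-form values $y_0 = T$ and $y_\alpha = (1 - e^{-\alpha T})/\alpha$ for $\alpha \ge 1$.

```python
import math


def trajectory_moments(traj, T, max_order, n_steps=20000):
    """Moments y_a = int_0^T x(t)^a dt of the occupation measure of a scalar trajectory."""
    dt = T / n_steps
    ts = [(k + 0.5) * dt for k in range(n_steps)]   # midpoints of the time grid
    return [dt * sum(traj(t) ** a for t in ts) for a in range(max_order + 1)]


T = 2.0
y = trajectory_moments(lambda t: math.exp(-t), T, max_order=3)
exact = [T] + [(1.0 - math.exp(-a * T)) / a for a in range(1, 4)]
print(y)
print(exact)
```

The zeroth moment is the mass of the occupation measure, that is, the horizon $T$.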



Note that LPs (18) and (21) are equivalent. To proceed numerically, we restrict the
continuously differentiable functions to be polynomial functions of the state. In other
words, we consider $v \in \mathbb{R}[x] \subset \mathcal{C}^1(X)$. By this restriction, we obtain an instance of
the generalized moment problem (GMP), i.e. an infinite-dimensional linear program over
moment sequences corresponding to the occupation measures. It turns out [15, Ch. 3] that
if the supports of the measures are compact basic semi-algebraic sets, the GMP can be
approached using an asymptotically converging hierarchy of LMI relaxations.

To write the semidefinite relaxation of the primal infinite-dimensional LP, let $y_i = (y_{i\alpha})$,
$\alpha \in \mathbb{N}^n \times \mathbb{N}^m$, be the moment sequence corresponding to the local occupation measure
$\mu_i$, $i = 1, \dots, r$. Moreover, let $y_0 = (y_{0\beta})$ and $y_T = (y_{T\beta})$ with $\beta \in \mathbb{N}^n$ be the moment
sequences corresponding to $\mu_0$ and $\mu_T$, respectively. Given any infinite sequence $y = (y_\alpha)$
of real numbers with $\alpha \in \mathbb{N}^n$, define the linear functional $\ell_y : \mathbb{R}[x] \to \mathbb{R}$ that maps
polynomials to real numbers as follows:

$$p(x) = \sum_{\alpha} p_\alpha x^\alpha \ \mapsto\ \ell_y(p) = \sum_{\alpha} p_\alpha y_\alpha.$$

In each LMI relaxation we truncate the infinite moment sequence to a finite number of
moments. The LMI relaxation of order $d$, including moments up to order $2d$, of the GMP
instance (21) can be formulated by taking test functions $v = x^\alpha$ with $\alpha \in \mathbb{N}^n$ such that
$\deg v \le 2d$, as follows:

$$
\begin{aligned}
p_d^* = \inf_{y_1, \dots, y_r, y_T}\ & \sum_{i=1}^{r} \ell_{y_i}(L_i) + \ell_{y_T}(L_T) \\
\text{s.t.}\ & \sum_{i=1}^{r} \ell_{y_i}(F_i(v)) + \ell_{y_T}(v) = \ell_{y_0}(v), \\
& M_d(y_i) \succeq 0, \quad \forall i, \\
& M_d(p_{i,k}\, y_i) \succeq 0, \quad \forall i,\ \forall k = 1, \dots, m_i, \\
& M_d(y_T) \succeq 0, \\
& M_d(p_{T,k}\, y_T) \succeq 0, \quad \forall k = 1, \dots, m_T.
\end{aligned}
\tag{23}
$$

The minimum relaxation order has to allow the enumeration of all the moments appearing
in the objective function and the linear equality constraint. The matrices $M_d(y_i)$ and
$M_d(y_T)$ are called moment matrices of the local occupation measures and the terminal
probability measure, respectively. Each moment matrix is defined to be a square matrix of
dimension $\binom{d+n}{n}$ filled with the moments of order up to $2d$ of the representing
measure. They are linear in the moments. Similarly, the matrices $M_d(p_{i,k}\, y_i)$ and $M_d(p_{T,k}\, y_T)$
are linear in the moments and are called localizing matrices. The linear equality
constraint represents the piecewise-affine dynamics, and the LMIs ensure that (22) holds and
that the measures are supported on the given sets defined in (4)-(5) (see [15] for more
details).
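To make the role of the moment and localizing matrices in (23) concrete, here is a small pure-Python sketch for the univariate case. It is an illustration under our own assumptions, not the GloptiPoly implementation: the helper names, the set $\{x : x - x^2 \ge 0\} = [0,1]$, and the two test measures (Lebesgue measure on $[0,1]$ and a Dirac measure at $x = 2$) are hypothetical choices. Positive semidefiniteness is tested through principal minors, which is adequate only for such tiny matrices.

```python
from itertools import combinations


def moment_matrix(y, d):
    """Univariate moment matrix: M_d(y)[i][j] = y_{i+j} for i, j = 0..d."""
    return [[y[i + j] for j in range(d + 1)] for i in range(d + 1)]


def localizing_matrix(y, p, d):
    """M_d(p y)[i][j] = sum_k p[k] * y_{i+j+k}, where p(x) = sum_k p[k] x^k."""
    return [[sum(pk * y[i + j + k] for k, pk in enumerate(p))
             for j in range(d + 1)] for i in range(d + 1)]


def det(M):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))


def is_psd(M, tol=1e-12):
    """A symmetric matrix is PSD iff all of its principal minors are nonnegative."""
    n = len(M)
    return all(det([[M[i][j] for j in idx] for i in idx]) >= -tol
               for r in range(1, n + 1) for idx in combinations(range(n), r))


d = 2
p = [0, 1, -1]                                       # p(x) = x - x^2, {p >= 0} = [0, 1]
y_leb = [1.0 / (k + 1) for k in range(2 * d + 3)]    # moments of Lebesgue measure on [0, 1]
y_dirac = [2 ** k for k in range(2 * d + 3)]         # moments of a Dirac measure at x = 2

print(is_psd(moment_matrix(y_leb, d)), is_psd(localizing_matrix(y_leb, p, d)))
print(is_psd(moment_matrix(y_dirac, d)), is_psd(localizing_matrix(y_dirac, p, d)))
```

Both sequences pass the moment-matrix test because both come from nonnegative measures, but only the Lebesgue moments pass the localizing test: the Dirac measure at $x = 2$ is not supported on $[0,1]$, which is exactly the kind of support violation the localizing LMIs are meant to exclude.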



3.4 The dual formulation 

The duality between finite measures and compactly supported bounded continuous
functions is captured by convex analysis. The dual of the LP (21) is thus formulated over the
space of positive bounded continuously differentiable functions as follows:

$$
\begin{aligned}
d^* = \sup_{v}\ & \langle v, \mu_0 \rangle \\
\text{s.t.}\ & z = c - \mathcal{L}(v), \\
& z \ge 0,
\end{aligned}
\tag{24}
$$

where $z$ is a vector of continuous functions. The dual LP can be written in more explicit
form to reveal the structure of the linear constraints. We can equivalently write

$$
\begin{aligned}
d^* = \sup_{v \in \mathcal{C}^1(X)}\ & \langle v, \mu_0 \rangle \\
\text{s.t.}\ & L_i - F_i(v) \ge 0, \quad \forall i = 1, \dots, r, \\
& L_T - v(x_T) \ge 0,
\end{aligned}
\tag{25}
$$

and more explicitly

$$
\begin{aligned}
d^* = \sup_{v \in \mathcal{C}^1(X)}\ & \int_{X_0} v \, d\mu_0 \\
\text{s.t.}\ & \nabla v(x) \cdot f_i + L_i(x,u) \ge 0, \quad \forall (x,u) \in X_i \times U,\ \forall i = 1, \dots, r, \\
& L_T - v(x_T) \ge 0, \quad \forall x_T \in X_T.
\end{aligned}
\tag{26}
$$



We note that any feasible solution of LP (26) is actually a global smooth subsolution of
the HJB equations (3). By conic complementarity, along the optimal trajectory $(x^*,u^*)$
it holds that

$$\langle z^*, \nu^* \rangle = 0. \tag{27}$$

Therefore, for the optimal dual function $v^*$, the following holds:

$$\nabla v^*(x^*) \cdot f_i + L_i(x^*,u^*) = 0, \quad \forall (x^*,u^*) \in X_i \times U,\ \forall i = 1, \dots, r, \tag{28}$$

and in addition,

$$v^*(x^*(T)) = L_T(x^*(T)). \tag{29}$$

This is an important result. It shows the following:

1. Equation (28) is the HJB PDE of the PWA optimal control problem satisfied along
optimal trajectories, with the terminal conditions given by (29).

2. The optimal dual function $v^*(x)$ is equivalent to the value function of the optimal
control problem, hence the notation. The maximizer function $v^*(x)$ of the dual
infinite-dimensional LP in equation (24) solves, globally, the HJB equation of the
PWA optimal control problem along optimal trajectories.



The dual convex relaxation, dual of LMI (23), is formulated over positive polynomials.
Putinar's Positivstellensatz [17] is used to enforce positivity. Therefore, the unknown
dual variables are the coefficients of the polynomial $v$ and several SOS polynomials that
deal with the polynomial positivity conditions of the constraints. The dual program can
then be written as follows:

$$
\begin{aligned}
d_d^* = \sup_{v_d,\, s}\ & \int_{X_0} v_d \, d\mu_0 \\
\text{s.t.}\ & L_i - F_i(v_d) = s_{i,0} + \sum_{k=1}^{m_i} p_{i,k}\, s_{i,k}, \quad \forall (x,u) \in X_i \times U,\ \forall i = 1, \dots, r, \\
& L_T - v_d(x) = s_{T,0} + \sum_{k=1}^{m_T} p_{T,k}\, s_{T,k}, \quad \forall x \in X_T,
\end{aligned}
\tag{30}
$$

in which the degree of $v_d$ is $d$. The polynomials $s_{i,0}$, $s_{i,k}$, $s_{T,0}$ and $s_{T,k}$ are positive. They
are Putinar's SOS representations of the constraint polynomials [17]. The polynomials
$p_{i,k}$ define the set $X_i \times U$ and the polynomials $p_{T,k}$ define the set $X_T$, see equations (4)-(5).

Using weak-star compactness arguments similar to those in the proof of [HI Theorem
2], we can show that $p^* = d^*$. Moreover, the equality $p^* = v^*(x_0)$ holds if we allow
relaxed controls (enlarging the set of admissible control functions to probability measures)
and chattering phenomena in OCP (2), see e.g. the discussion in [TJl Section 3.2] and
references therein. Note that in [16], the identity $p^* = d^* = v^*(x_0)$ was shown under
stronger convexity assumptions.
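To see what a Putinar-type certificate of the kind used in (30) looks like, consider the following toy instance, which is our own illustration and not taken from the paper. On the interval $[-1,1] = \{x : p_1(x) = 1 - x^2 \ge 0\}$, the polynomial $q(x) = x + 1$ is nonnegative, and this is certified by an SOS polynomial $s_0$ and a (constant) SOS multiplier $s_1$:

```latex
\[
  x + 1
  \;=\;
  \underbrace{\tfrac{1}{2}\,(x+1)^2}_{s_0(x)}
  \;+\;
  \underbrace{\tfrac{1}{2}}_{s_1(x)}\,\underbrace{(1 - x^2)}_{p_1(x)} .
\]
% Check: (1/2)(x^2 + 2x + 1) + (1/2)(1 - x^2) = x + 1.
```

The identity holds for all $x$, and since $s_0$, $s_1$ are sums of squares and $p_1 \ge 0$ on the interval, it proves $q \ge 0$ there. In (30), the coefficients of $v_d$ and of the multipliers play the role of the unknowns, and matching coefficients on both sides gives linear constraints coupled with semidefinite constraints on the SOS Gram matrices.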



4 Suboptimal control synthesis 

Assume that the analytical value function $v^*(x)$ of a general optimal control problem is
available by solving the HJB PDE. The optimal feedback control function $k^*(x(t))$ can
then be selected such that it generates an admissible optimal control trajectory $u^*(t)$
that satisfies the necessary and sufficient optimality conditions (28). The resulting optimal
feedback function $k^*(\cdot)$ generates the admissible optimal control trajectory starting from
any initial value $x_0$. Therefore we get the solution of the OCP as a feedback strategy:
if the system is at state $x$, the control is adjusted to $k^*(x)$. This approach has
the difficulty that even if the value function were smooth, there would in general be no
continuous optimal state feedback law that satisfies the optimality conditions for every
state.

It is well-known from Brockett's existence theorem that if a dynamic control system with
some vector field $f(x,u)$ admits a continuous stabilizing feedback law (taking the origin
as equilibrium point), then for every $\delta > 0$, the set $f(B(0,\delta),U)$ is a neighborhood of
the origin [3], where $B(0,\delta)$ is a ball of radius $\delta$ centered at zero. One famous
example where this condition fails is the nonholonomic integrator [5]. More recently,
it was shown that asymptotic controllability of a system is necessary and sufficient for
state feedback stabilization without insisting on the continuity of the feedback. The
synthesis of such discontinuous feedbacks was described in [5], together with a definition
of a solution concept for an ODE with discontinuous dynamics, namely the
sample-and-hold implementation. It turns out that this concept is very convenient for PWA systems.
The first paper to deal with sampled-data PWA systems in the form used here can be
found in reference



The suboptimal trajectory of the control system is defined by a partition, call it $\pi$,
of the time set $[0, T_\pi]$ as follows: define $\pi = \{t_j\}_{0 \le j \le p}$ with a given diameter
$d(\pi) := \sup_{0 \le j \le p-1} (t_{j+1} - t_j)$ such that $0 = t_0 < t_1 < \cdots < t_p = T_\pi$. The sequence $t_j$ for
$1 \le j \le p$ depends on the evolution of the trajectory and the diameter of the partition $d(\pi)$. Starting
at an initial time $t_j$, the suboptimal trajectory is the classical solution of the Lipschitz
ODE

$$
\begin{aligned}
\dot{x}(t) &= A_i x(t) + a_i + B_i k^*(x_j), \quad x(t) \in X_i, \\
x_j &= x(t_j), \quad t_j \le t \le t_{j+1},
\end{aligned}
$$

such that $t_{j+1}$ depends on the evolution of the trajectory.

The generated suboptimal trajectory corresponds to a piecewise constant open-loop
control (point-wise feedback), which has physical meaning. The optimal points of the
trajectory are those points at the beginning of each subinterval at which the control is updated
using the optimal control function. The suboptimal trajectory converges to the optimal
trajectory when $d(\pi) \to 0$. This corresponds to increasing the sampling rate in the
implementation such that the sampling time $T_s \to 0$. In this case the generated trajectory
corresponds to a fixed optimal feedback. This can be shown as follows: take a partition
$\pi$ with $d(\pi) = \delta$; we can use the mean value theorem to write

$$
\begin{aligned}
& v^*(x_{j+1}) - v^*(x_j) = [\nabla v^*(x(\tau)) \cdot f_i(x(\tau), u^*(x_j))]\,(t_{j+1} - t_j), \\
& \delta_j = (t_{j+1} - t_j) \le \delta, \quad \text{and } \tau \in [0, \delta_j],
\end{aligned}
$$

and we note that when $\delta_j \to 0$,

$$\dot{v}^*(x_j) \to \nabla v^*(x(t_j)) \cdot f_i(x(t_j), u^*(x_j)) = -L_i(x_j, u^*(x_j)),$$

and the algorithm converges to the optimal trajectories. It is then clear that for every
initial condition $x_0$ and some prescribed $g > 0$, $\epsilon > 0$, there exist some $\delta > 0$ and $T_\pi > 0$
such that if $d(\pi) < \delta$ the generated suboptimal trajectories starting at $x_0$ satisfy

$$
\begin{aligned}
J(x_0, u(t)) - v^*(x_0) &\le g \quad \text{(suboptimality gap)}, \\
\| x(T_\pi) - x_T \| &\le \epsilon \quad \text{(tolerance)}.
\end{aligned}
$$

In the special case of having $\delta = 0$, we have no gap due to the algorithm and $T_\pi = T$.

Assuming that a polynomial approximation of the value function for a given PWA
continuous system is available by solving the relaxed dual LMI, the proposed suboptimal
feedback policy $[0, T) \times \mathbb{R}^n \to U$ is constructed using closed-loop sampling and Algorithm 1.

Algorithm 1 Algorithm for Suboptimal Synthesis
given: $d(\pi)$, $\epsilon$, $x_0 \in X_i$, $x_T$, and a polynomial approximation $v$ of $v^*(\cdot)$
initialization: $t = t_0 = 0$, $x(0) = x_0$, $j = 0$. {first interval}
while $\| x(t) - x_T \| > \epsilon$ do
    solve the static polynomial optimization problem
        $u^*(x_j) = \arg\min_{u \in U} \nabla v(x_j) \cdot f_i(x_j,u) + L_i(x_j,u)$
    repeat
        solve $\dot{x}(t) = A_i x(t) + a_i + B_i u^*(x_j)$, starting at $t_j$
    until $x(t) \notin X_i$ or $t - t_j = d(\pi)$
    set $j \leftarrow j + 1$ {next interval}
    set $t_j \leftarrow t$, and $x_j \leftarrow x(t)$. {initial values for next interval}
    determine new region $X_k$ for $x(t)$ and set $i \leftarrow k$
end while
$p \leftarrow j$ {number of intervals in $\pi$}
return $\pi = \{t_j\}_{0 \le j \le p}$, $x(t)$, and $u(t)$ with $0 \le t \le T_\pi$.
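The steps of Algorithm 1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses a scalar two-cell system with $f_1 = -x + 1 + u$ for $x \ge 0$, $f_2 = x + 1 + u$ for $x < 0$ and running cost $2(x-1)^2 + u^2$; the function `grad_v` is a hand-derived stand-in for the gradient of the polynomial approximation (in practice it would come from the dual LMI relaxation); the input set is taken as $U = \mathbb{R}$, so that the static minimization has the closed form $u = -\tfrac{1}{2}\nabla v$; and the ODE is integrated with a plain forward Euler scheme of step `h`.

```python
import math

SQRT3 = math.sqrt(3.0)


def cell_of(x):
    """Index of the cell containing x: 1 for x >= 0, 2 for x < 0."""
    return 1 if x >= 0.0 else 2


def f(i, x, u):
    """Local affine dynamics of the (assumed) two-cell scalar system."""
    return (-x + 1.0 + u) if i == 1 else (x + 1.0 + u)


def lagrangian(x, u):
    return 2.0 * (x - 1.0) ** 2 + u ** 2


def grad_v(x):
    """Stand-in for the gradient of the value function approximation (assumption)."""
    if x >= 0.0:
        return 2.0 * (SQRT3 - 1.0) * (x - 1.0)
    return 2.0 * (x + 1.0) - 2.0 * math.sqrt(2.0 * (x - 1.0) ** 2 + (x + 1.0) ** 2)


def sample_and_hold(x0, xT, d_pi, eps, h=1e-3, t_max=50.0):
    """Sample-and-hold feedback: update u at t_j, hold it until d_pi elapses or the cell changes."""
    t, x, cost = 0.0, x0, 0.0
    times = [0.0]
    while abs(x - xT) > eps and t < t_max:
        i = cell_of(x)
        u = -0.5 * grad_v(x)            # argmin_u grad_v * f_i + L_i, valid since B_i = 1 and U = R
        t_j = t
        while cell_of(x) == i and t - t_j < d_pi:
            cost += lagrangian(x, u) * h
            x += f(i, x, u) * h         # forward Euler step with the held control
            t += h
        times.append(t)                 # next sampling instant t_{j+1}
    return times, x, cost


times, x_end, cost = sample_and_hold(x0=-1.0, xT=1.0, d_pi=0.05, eps=1e-3)
print(len(times) - 1, x_end, cost)
```

With these assumptions the trajectory starting at $x_0 = -1$ should end within the tolerance of $x_T = 1$, with an accumulated cost slightly above the value obtained by integrating `grad_v` from $-1$ to $1$ (about 4.16 by our own calculation, which is not a figure from the paper). Shrinking `d_pi` and `h` should reduce the gap, in line with the convergence argument above.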
5 Numerical Example



We consider a first-order PWA dynamic system with two cells. The optimization is done
over both the control action and the time horizon (free time). The OCP is defined as follows:

$$
\begin{aligned}
v^*(x_0) = \inf_{T,\,u}\ & \int_0^T \big( 2(x-1)^2 + u^2 \big)\, dt \\
\text{s.t.}\ & \dot{x} = \begin{cases} f_1 = -x + 1 + u, & x \in X_1, \\ f_2 = x + 1 + u, & x \in X_2, \end{cases} \\
& x(0) = -1, \quad x(T) = +1,
\end{aligned}
\tag{31}
$$

where $v^*$ is the value function, and $x_0 = x(0)$ is the initial state. The control $u(t)$ is
assumed to be a bounded measurable function defined on the interval $t \in [0, T]$, and
taking its values in $\mathbb{R}$. The state space is partitioned into two unbounded regions

$$X = X_1 \cup X_2, \quad \text{where} \quad X_1 = \{x \in X \mid x \ge 0\}, \quad X_2 = \{x \in X \mid x \le 0\}.$$


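The control Hamiltonian

$$
H(x, u, \nabla v^*) =
\begin{cases}
\nabla v^* \, f_1 + \big( 2(x-1)^2 + u^2 \big), & x \in X_1, \\
\nabla v^* \, f_2 + \big( 2(x-1)^2 + u^2 \big), & x \in X_2,
\end{cases}
\tag{32}
$$

is used to write the HJB equation. Since the terminal time $T$ is subject to optimization,
the control Hamiltonian vanishes along the optimal trajectory.

This gives the HJB PDE

$$\inf_{u} H(x(t), u(t), \nabla v^*(x(t))) = 0, \quad \forall x(t),\ t \in [0,T]. \tag{33}$$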

An analytical optimal solution can be obtained by solving the HJB equations corresponding
to each cell. This results in a state feedback control

$$
k^*(x) =
\begin{cases}
(1 - \sqrt{3})(x - 1), & \text{if } x \ge 0, \\
-x - 1 + \sqrt{2(x-1)^2 + (x+1)^2}, & \text{if } x \le 0,
\end{cases}
$$

that satisfies the HJB PDE (33) for all $t \in [0, T]$ and $x$. The optimal control trajectory

$$u^*(t) = k^*(x^*(t)), \quad \forall t \in [0,T],$$

is obtained when the optimal state feedback is applied starting at the given initial
condition. The partial derivative of the value function with respect to the state is a viscosity
solution of the HJB PDE (33), see e.g. [7].
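The claim that this feedback satisfies the HJB PDE (33) can be checked numerically. In the sketch below (our own illustration; the function names are hypothetical), the costate is eliminated through the stationarity condition $\partial H / \partial u = \nabla v^* + 2u = 0$, i.e. $\nabla v^* = -2 k^*(x)$, which is a derivation step we add and not a formula quoted from the paper. The Hamiltonian is then evaluated on a grid of states.

```python
import math


def k_star(x):
    """Analytical state feedback of the two-cell example."""
    if x >= 0.0:
        return (1.0 - math.sqrt(3.0)) * (x - 1.0)
    return -x - 1.0 + math.sqrt(2.0 * (x - 1.0) ** 2 + (x + 1.0) ** 2)


def hamiltonian(x, u, p):
    """Control Hamiltonian with costate p playing the role of the gradient of v*."""
    fx = (-x + 1.0 + u) if x >= 0.0 else (x + 1.0 + u)
    return p * fx + 2.0 * (x - 1.0) ** 2 + u ** 2


grid = [-2.0 + 0.01 * k for k in range(401)]   # states in [-2, 2]
# Stationarity in u gives p = -2 u (assumption: U = R, so the minimizer is interior).
residual = max(abs(hamiltonian(x, k_star(x), -2.0 * k_star(x))) for x in grid)
jump_at_zero = abs(k_star(0.0) - k_star(-1e-12))
print(residual, jump_at_zero, k_star(1.0))
```

The residual should vanish up to rounding, the feedback is continuous across the cell boundary $x = 0$ (both branches give $\sqrt{3} - 1$ there), and the control is zero at the target state $x = 1$.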



The initial and final states are known. Therefore, the initial and terminal occupation 
measures are Dirac measures. The global occupation measure \i G Ai(X x U) is used 
to encode the system trajectories. It is defined as a combination of two local occupation 



measures, one for each cell, as follows 



[i = /ii + /i 2 



13 



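such that the support of $\mu_1$ is $\{(x,u) \mid x \in X_1,\ u \in U\}$ and the support of $\mu_2$ is
$\{(x,u) \mid x \in X_2,\ u \in U\}$. The OCP can then be formulated as an infinite-dimensional LP in
measure space

$$
\begin{aligned}
p^* = \inf_{\mu_1,\,\mu_2}\ & \int_{X_1 \times U} L \, d\mu_1 + \int_{X_2 \times U} L \, d\mu_2 \\
\text{s.t.}\ & \int_{X_1 \times U} \nabla v \cdot f_1 \, d\mu_1 + \int_{X_2 \times U} \nabla v \cdot f_2 \, d\mu_2
 = \int v \, d\mu_T - \int v \, d\mu_0, \quad \forall v
\end{aligned}
\tag{35}
$$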

where $\mu_0$ and $\mu_T$ are the initial and final measures, supported on $X_0 = \{-1\}$ and
$X_T = \{+1\}$, respectively, and $L = 2(x-1)^2 + u^2$ is the Lagrangian. The functions $v$ are
functions of the state $x$ and belong to the space of continuously differentiable functions.
The Matlab toolbox GloptiPoly [9] is used to formulate this infinite-dimensional LP on
measures as a GMP. Here, we are mainly interested in solving the dual formulation on functions.
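To make the LP data in (35) concrete, the sketch below accumulates the integrals against the two local occupation measures along a simulated closed-loop trajectory and checks the linear constraint for the monomial test functions $v(x) = x^k$. It is an illustration only: the trajectory is generated with the analytical feedback (34) and forward Euler, the horizon is truncated at a finite time, so the terminal Dirac measure sits at the simulated end state rather than exactly at $+1$, and all names are hypothetical.

```python
import math

SQRT3 = math.sqrt(3.0)


def k_star(x):
    """Analytical state feedback (34)."""
    if x >= 0.0:
        return (1.0 - SQRT3) * (x - 1.0)
    return -x - 1.0 + math.sqrt(2.0 * (x - 1.0) ** 2 + (x + 1.0) ** 2)


def f(x, u):
    """PWA vector field of (31)."""
    return -x + 1.0 + u if x >= 0.0 else x + 1.0 + u


def occupation_integrals(x0=-1.0, dt=1e-4, t_end=6.0, max_degree=4):
    """Integrate against mu_1 (index 0, x >= 0) and mu_2 (index 1, x < 0).

    Returns the end state, the per-cell mass, the per-cell cost, and for each
    monomial v(x) = x^k the per-cell integral of grad v . f_i.
    """
    x = x0
    mass = [0.0, 0.0]
    cost = [0.0, 0.0]
    lhs = [[0.0, 0.0] for _ in range(max_degree + 1)]
    for _ in range(int(round(t_end / dt))):
        u = k_star(x)
        cell = 0 if x >= 0.0 else 1
        dx = f(x, u)
        mass[cell] += dt
        cost[cell] += (2.0 * (x - 1.0) ** 2 + u ** 2) * dt
        for k in range(1, max_degree + 1):
            lhs[k][cell] += k * x ** (k - 1) * dx * dt
        x += dx * dt
    return x, mass, cost, lhs


x_end, mass, cost, lhs = occupation_integrals()
# Constraint of (35): int grad v.f_1 dmu_1 + int grad v.f_2 dmu_2 = v(x_end) - v(x_0).
residuals = [
    abs(lhs[k][0] + lhs[k][1] - (x_end ** k - (-1.0) ** k))
    for k in range(1, 5)
]
print("mass of mu_1, mu_2:", mass)
print("objective value:", sum(cost))
print("constraint residuals for v = x^k, k = 1..4:", residuals)
```

The total mass equals the elapsed time, so it keeps growing as the simulation horizon is extended while the objective value settles.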

In the present example, neither the state space nor the input space is compact. Hence,
the LMI relaxations do not include any localizing constraints; moreover, Putinar's
conditions are not sufficient for the convergence of the LMI optimal value, since sufficiency
is guaranteed only for measures with compact support. The main concern here is that the
mass (zeroth moment) of the occupation measure is not bounded and might tend to
infinity, which is numerically problematic. There are two ways to work around this. The first
is to make the time interval compact by adding a constraint on the mass of the global
occupation measure; for example, we enforce that the mass of $\mu$ is less than a given large
positive number. This exploits the fact that the vector field is asymptotically stable.
The second, which is more general, is to constrain $X$ and $U$ to sufficiently large subsets.
One possibility is to take $X$ as a large ball centered at the origin (or at the initial state $x_0$).
The same can be assumed for the input space $U$. This causes no difficulty as long as
the optimal trajectory remains in the constraint set $(X, U)$. To avoid numerical problems,
it is also recommended in all cases to scale the problem down, if possible, so that all the
variables lie inside the unit box. Scaling avoids blow-up of the moment sequences for high
relaxation orders.
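The effect of scaling on the moment sequence can be seen on a toy measure. The sketch below is an illustration under stated assumptions, not the occupation measure of the OCP: it uses the normalized Lebesgue measure on an interval $[-R, R]$, with the radius $R = 5$ chosen arbitrarily, and lists the moments up to degree $2d$ for $d = 6$.

```python
def uniform_moments(radius, max_degree):
    """Moments of the normalized Lebesgue measure on [-radius, radius].

    m_k = radius**k / (k + 1) for even k, and 0 for odd k.
    """
    return [
        0.0 if k % 2 else radius ** k / (k + 1)
        for k in range(max_degree + 1)
    ]


ORDER = 6          # relaxation order d; moments up to degree 2d appear
RADIUS = 5.0       # illustrative size of the unscaled domain (assumption)

raw = uniform_moments(RADIUS, 2 * ORDER)
scaled = uniform_moments(1.0, 2 * ORDER)

# Substituting x = RADIUS * y maps the unit interval to the original one,
# and the moments transform as m_k(x) = RADIUS**k * m_k(y).
recovered = [m * RADIUS ** k for k, m in enumerate(scaled)]

print("largest unscaled moment:", max(raw))
print("largest scaled moment:  ", max(scaled))
```

The unscaled moments grow like $R^k$, whereas the moments on the unit interval stay bounded by the mass, which is why the entries of the moment matrices remain of comparable magnitude after scaling.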



The results obtained below for the OCP (31) assume that the global occupation measure
is supported on $X \times U$ without introducing any constraints on the spaces.

The solution consists of two main steps:

1. Finding a relatively good polynomial approximation of the value function by solving
the dual LMI relaxation of the infinite-dimensional LP (35).

2. Employing the suboptimal strategy developed in Section 4 to generate a suboptimal
admissible feedback control law based on the smooth approximation of the value
function obtained in step 1.

The first step is achieved by solving the relaxed dual LMI on continuous functions. Figure 1
shows the obtained approximation of the value function with LMI relaxation order $d = 6$.

The obtained approximation is a polynomial of the state $x$ with degree equal to $2d$.
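The role of the dual polynomial can be illustrated by hand with a low-degree candidate. Any continuously differentiable $w$ that satisfies $L + \nabla w \cdot f_i \ge 0$ on each cell for all inputs gives, by integration along any admissible trajectory, the lower bound $w(x_0) - w(x_T) \le v^*(x_0)$. The sketch below checks this for the quadratic $w(x) = (\sqrt{3} - 1)(x - 1)^2$, which is a hand-picked assumption and not the output of the LMI relaxation; the sign convention of the dual program is also assumed here. The reference value $v^*(-1)$ is obtained by quadrature of $-\nabla v^* = 2k^*$.

```python
import math

SQRT3 = math.sqrt(3.0)
C = SQRT3 - 1.0  # hand-picked coefficient (assumption, not LMI output)


def w(x):
    """Degree-2 candidate subsolution."""
    return C * (x - 1.0) ** 2


def dw(x):
    return 2.0 * C * (x - 1.0)


def drift(x):
    """Input-free part of the PWA vector field (31), so that f = drift + u."""
    return -x + 1.0 if x >= 0.0 else x + 1.0


def min_hamiltonian(x):
    """min over u of dw*(drift + u) + 2(x - 1)^2 + u^2, attained at u = -dw/2."""
    p = dw(x)
    return p * drift(x) - 0.25 * p * p + 2.0 * (x - 1.0) ** 2


def k_star(x):
    """Analytical state feedback (34)."""
    if x >= 0.0:
        return (1.0 - SQRT3) * (x - 1.0)
    return -x - 1.0 + math.sqrt(2.0 * (x - 1.0) ** 2 + (x + 1.0) ** 2)


def value_at(x0, n=20000):
    """v*(x0) as the integral of 2 k*(s) over [x0, 1], by the trapezoidal rule."""
    h = (1.0 - x0) / n
    total = 0.5 * (2.0 * k_star(x0) + 2.0 * k_star(1.0))
    for i in range(1, n):
        total += 2.0 * k_star(x0 + i * h)
    return total * h


grid = [-3.0 + 0.001 * i for i in range(6001)]
worst = min(min_hamiltonian(x) for x in grid)   # should be nonnegative up to rounding
lower_bound = w(-1.0) - w(1.0)
v_star = value_at(-1.0)
print("min of the minimized Hamiltonian on the grid:", worst)
print("lower bound from w:", lower_bound, " reference value:", v_star)
```

A degree-2 candidate leaves a visible gap to the reference value; raising the degree, as the relaxation order does, is what allows the lower bound to tighten.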

Based on this approximation, we employ Algorithm 1 from Section 4. The resulting
suboptimal feedback control is shown in Figure 2 in comparison with the analytical optimal
feedback, $k^*(x)$, calculated using (34). The algorithm gives a suboptimal feedback
close to the optimal one.

Figure 1: PWA system. Approximating value function for d = 6.
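A sample-and-hold feedback of this kind can be sketched as follows. This is not the algorithm of Section 4, whose details are not reproduced here: the sketch assumes that the input at each sampling instant is the minimizer of the Hamiltonian, $u = -\nabla v / 2$, and holds it until the next sample. The polynomial is also a stand-in: instead of the LMI solution, the derivative of a degree-12 polynomial $v$ is built by interpolating $-2k^*(x)$ at 12 Chebyshev points. The fitting interval, sampling period and function names are illustrative.

```python
import math

SQRT3 = math.sqrt(3.0)


def k_star(x):
    """Analytical state feedback (34), used to build the stand-in and to compare."""
    if x >= 0.0:
        return (1.0 - SQRT3) * (x - 1.0)
    return -x - 1.0 + math.sqrt(2.0 * (x - 1.0) ** 2 + (x + 1.0) ** 2)


def f(x, u):
    """PWA vector field of (31)."""
    return -x + 1.0 + u if x >= 0.0 else x + 1.0 + u


DEG = 11   # degree of dv/dx, so v has degree 12 = 2d with d = 6
A = 1.5    # half-width of the fitting interval [-A, A] (assumption)
NODES = [A * math.cos(math.pi * j / DEG) for j in range(DEG + 1)]
VALUES = [-2.0 * k_star(x) for x in NODES]           # samples of dv*/dx
WEIGHTS = [(-1.0) ** j * (0.5 if j in (0, DEG) else 1.0) for j in range(DEG + 1)]


def dv_poly(x):
    """Barycentric evaluation of the degree-11 polynomial interpolant of dv*/dx."""
    num = den = 0.0
    for xj, gj, wj in zip(NODES, VALUES, WEIGHTS):
        if x == xj:
            return gj
        c = wj / (x - xj)
        num += c * gj
        den += c
    return num / den


def policy(x):
    """Minimizer of the Hamiltonian for the polynomial stand-in."""
    return -0.5 * dv_poly(x)


def sample_and_hold(x0=-1.0, period=0.05, t_end=6.0, substeps=50):
    """Closed loop with the input held constant between sampling instants."""
    x = x0
    xs, us = [x], []
    dt = period / substeps
    for _ in range(int(round(t_end / period))):
        u = policy(x)                 # computed once per sample
        for _ in range(substeps):
            x += f(x, u) * dt         # Euler substeps, correct cell at each substep
        xs.append(x)
        us.append(u)
    return xs, us


xs, us = sample_and_hold()
max_gap = max(abs(policy(-1.0 + 0.01 * i) - k_star(-1.0 + 0.01 * i)) for i in range(201))
print("final state:", xs[-1])
print("max |policy - k*| on [-1, 1]:", max_gap)
```

The resulting control signal is piecewise constant, and any mismatch between the polynomial and the true value function shows up as a small steady-state offset from the target state.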



6 Conclusions 

The focus of this paper is the synthesis of suboptimal state feedback controllers for
continuous-time optimal control problems (OCPs) with piecewise-affine (PWA) dynamics
and piecewise polynomial cost functions. State and input constraints are handled
directly in the formulation and do not add complexity.

The problem is formulated as an abstract infinite-dimensional convex optimization problem
over a space of occupation measures, which is then solved via a converging hierarchy
of LMI problems. By restricting the dual variables in the dual of the original
infinite-dimensional program to be monomials, we obtain a polynomial representation of the value
function of the OCP in terms of the upper envelope of subsolutions to a system of HJB
equations corresponding to the OCP. By fixing the degree of the monomials, the same dual
program can be relaxed and written, using Putinar's Positivstellensatz, as a polynomial
sum-of-squares (SOS) program, which can be transformed and solved as an LMI problem.
As soon as the polynomial approximation of the value function is available, one can
systematically generate a suboptimal, yet admissible, feedback control. The suboptimal
control strategy is based on a closed-loop sampling implementation, which is very
convenient for PWA systems. The method generates an approximate control signal which is
piecewise constant, and near-optimal trajectories that respect the given constraints.
Figure 2: PWA system. Suboptimal feedback for d = 6.